Upgrade to text-embedding-3-large model as default, with vector storage optimizations #2470
base: main
Conversation
Check Country Locale in URLs: We have automatically detected country locales added to URLs in your files. Check the file paths and the associated URLs inside them.
Pull Request Overview
This PR upgrades the default embedding model to "text-embedding-3-large" (3072 dimensions) and implements several vector storage optimizations including truncation, binary quantization, and preserving original values for rescoring. It also introduces new environment variables for embedding field names and updates documentation and related code to support the new configuration.
- Updated tests and environment variables for the new embedding model and dimensions.
- Revised documentation to reflect model and deployment changes.
- Refactored search management and approach modules to use dynamic embedding field names.
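For orientation, here is a minimal sketch of how the backend might pick up the new model, dimension, and field-name settings. This is not the PR's code: the default values shown (and the flat module-level layout) are assumptions for illustration; only `AZURE_SEARCH_FIELD_NAME_EMBEDDING` is introduced by this PR, while the other variable names follow the repo's existing naming.

```python
import os

# Illustrative defaults; the PR's actual defaults may differ.
EMBEDDING_MODEL = os.getenv("AZURE_OPENAI_EMB_MODEL_NAME", "text-embedding-3-large")
EMBEDDING_DIMENSIONS = int(os.getenv("AZURE_OPENAI_EMB_DIMENSIONS", "3072"))
EMBEDDING_FIELD = os.getenv("AZURE_SEARCH_FIELD_NAME_EMBEDDING", "embedding3")  # assumed default name
```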
Reviewed Changes
Copilot reviewed 24 out of 27 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/conftest.py | Updated mocked model and dimensions for the new embedding model. |
| docs/gpt4v.md | Changed embedding model reference from ada to text-embedding-3-large. |
| docs/deploy_features.md | Updated deployment instructions and embedding model references. |
| docs/deploy_existing.md | Adjusted instructions for existing deployments to use the new model. |
| azure.yaml | Added new environment variables for embedding field names. |
| app/backend/prepdocslib/searchmanager.py | Refactored index creation to use dynamic embedding field names and profiles. |
| app/backend/integratedvectorizerstrategy.py | Passed new search field names into indexer skill configuration. |
| app/backend/prepdocs.py | Updated embedding field configuration from environment variables. |
| app/backend/approaches/* | Modified constructors and vector field usages to accept new embedding field. |
| app/backend/app.py | Integrated new environment variables for embedding field names in client setup. |
| .github/workflows/azure-dev.yml & .azdo/pipelines/azure-dev.yml | Included new environment variable exports for embedding field names. |
Files not reviewed (3)
- app/backend/requirements.txt: Language not supported
- infra/main.bicep: Language not supported
- infra/main.parameters.json: Language not supported
By default, the deployed Azure web app uses the `text-embedding-3-large` embedding model. If you want to use a different embeddig model, you can do so by following these steps:
There's a typo in the word 'embeddig'. Please change it to 'embedding'.
- By default, the deployed Azure web app uses the `text-embedding-3-large` embedding model. If you want to use a different embeddig model, you can do so by following these steps:
+ By default, the deployed Azure web app uses the `text-embedding-3-large` embedding model. If you want to use a different embedding model, you can do so by following these steps:
"id": self.id, | ||
"content": self.content, | ||
# Should we rename to its actual field name in the index? |
This is a pending question: should we send it down using the actual field name from the index?
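A possible answer is sketched below (hypothetical method and attribute names, not the PR's code): key the vector by the configured field name when serializing, so the document shape always matches the index schema.

```python
# Hypothetical sketch; only AZURE_SEARCH_FIELD_NAME_EMBEDDING comes from this PR.
def to_index_document(self, embedding_field: str) -> dict:
    return {
        "id": self.id,
        "content": self.content,
        embedding_field: self.embedding,  # dynamic key instead of a hard-coded "embedding"
    }
```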
@@ -317,7 +321,8 @@ class ExtraArgs(TypedDict, total=False):
    **dimensions_args,
)
query_vector = embedding.data[0].embedding
return VectorizedQuery(vector=query_vector, k_nearest_neighbors=50, fields="embedding")
# TODO: use optimizations from rag time journey 3
No changes are made to the actual search request, right?
We didn't make any here:
https://github.com/microsoft/rag-time/blob/main/Journey%203%20-%20Optimize%20your%20Vector%20Index%20for%20Scale/sample/3-Vector-Compression.ipynb
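For reference, the TODO could resolve to something like the sketch below. It assumes compression, truncation, and rescoring are configured on the index itself, so the query only needs to target the configurable field name; this is illustrative, not the PR's final code.

```python
from azure.search.documents.models import VectorizedQuery

def build_vector_query(query_vector: list[float], embedding_field: str) -> VectorizedQuery:
    # Same request shape as before; only the target field is configurable now.
    # Oversampling/rescoring behavior comes from the index definition, not the query.
    return VectorizedQuery(vector=query_vector, k_nearest_neighbors=50, fields=embedding_field)
```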
if field == "embedding" | ||
else await self.compute_image_embedding(query_text) | ||
await self.compute_image_embedding(query_text) | ||
if field.startswith("image") |
This was a bit tricky; it feels a bit code-smelly.
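One way to reduce the smell would be to centralize the prefix check in a single helper, sketched below. The helper name is hypothetical; it assumes the existing compute_text_embedding/compute_image_embedding methods.

```python
async def compute_vector(self, field: str, text: str) -> list[float]:
    # Route on the field name prefix in one place so callers don't repeat the branch.
    if field.startswith("image"):
        return await self.compute_image_embedding(text)
    return await self.compute_text_embedding(text)
```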
if field == "embedding" | ||
else await self.compute_image_embedding(q) | ||
await self.compute_image_embedding(q) | ||
if field.startswith("image") |
Same logic here
@@ -83,7 +85,7 @@ async def run(
     minimum_reranker_score = overrides.get("minimum_reranker_score", 0.0)
     filter = self.build_filter(overrides, auth_claims)

-    vector_fields = overrides.get("vector_fields", ["embedding"])
+    vector_fields = overrides.get("vector_fields", [self.embedding_field])
Similar here; we need to decide whether the frontend should render the actual field names.
Purpose
This pull request changes the default embedding model to text-embedding-3-large, with 3072 dimensions, along with these AI Search vector storage optimizations:
- Truncating the stored vector dimensions
- Binary quantization of the stored vectors
- Preserving the original full-precision values for rescoring
See this notebook for a demonstration of the effects of those optimizations. Due to the rescoring, the search quality remains high.
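To give a rough sense of why these optimizations matter, here is some back-of-the-envelope math. The 1024-dimension truncation target is an assumption for illustration, and real per-vector index overhead differs.

```python
full_dims = 3072                       # text-embedding-3-large output size
full_bytes = full_dims * 4             # float32 storage: 12,288 bytes per vector
truncated_dims = 1024                  # assumed truncation target
quantized_bytes = truncated_dims // 8  # binary quantization: 1 bit per dimension -> 128 bytes

print(f"~{full_bytes // quantized_bytes}x smaller in the vector index")  # ~96x
# Full-precision originals are still kept and used to rescore the top candidates,
# which is why search quality stays high.
```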
This PR introduces a new environment variable, `AZURE_SEARCH_FIELD_NAME_EMBEDDING`, so that developers can in principle have multiple fields in their index for different embedding sizes/models.

This PR also changes the SKU for all models to GlobalStandard. It has become really tricky to find a region for the Standard SKU that works for all the models. Some developers may not be comfortable with GlobalStandard, depending on their regulations, so they can still change the SKU manually as desired.
Fixes #2383
Does this introduce a breaking change?
When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.
Does this require changes to learn.microsoft.com docs?
This repository is referenced by this tutorial, which includes deployment, settings, and usage instructions. If text or screenshots need to change in the tutorial, check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.
Type of change
Code quality checklist
See CONTRIBUTING.md for more details.
- The current tests all pass (`python -m pytest`).
- I ran `python -m pytest --cov` to verify 100% coverage of added lines.
- I ran `python -m mypy` to check for type errors.
- I either used the pre-commit hooks or ran `ruff` and `black` manually on my code.